
Fix bug in FsaFromTensor for empty FSA; make index select from ragged… #481

Merged
danpovey merged 1 commit into k2-fsa:master on Dec 9, 2020

Conversation

@danpovey
Collaborator

danpovey commented Dec 9, 2020

… print out error cause

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020

When using this to debug the determinization failure in snowfall, I see:

```
[F] /ceph-dan/k2/k2/python/csrc/torch/index_select.cu:at::Tensor k2::SimpleRaggedIndexSelect1D(at::Tensor, k2::Ragged<int>&) [with T = int]:208 There must be at most one non-zero element in src for any sub-list in indexes; sub-list 30861212 has too many elements: [ 4 200004 ]
```
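
For context, here is a minimal pure-Python sketch of the invariant this check enforces, with a list of lists standing in for the ragged `indexes`; this is illustrative only, not k2's implementation:

```python
# Sketch (not k2's code) of the SimpleRaggedIndexSelect1D invariant: for each
# sub-list of `indexes`, at most one selected entry of `src` may be non-zero;
# that entry (or zero) becomes the output value for the sub-list.
def simple_ragged_index_select(src, indexes):
    out = []
    for row in indexes:
        nonzero = [src[i] for i in row if src[i] != 0]
        if len(nonzero) > 1:
            # This is the case the log above reports, e.g. values [4, 200004].
            raise ValueError(f'too many non-zero elements: {nonzero}')
        out.append(nonzero[0] if nonzero else 0)
    return out

src = [0, 4, 0, 200004]  # e.g. aux_labels, one per input arc
print(simple_ragged_index_select(src, [[0, 2]]))  # -> [0]
try:
    simple_ragged_index_select(src, [[1, 3]])     # two non-zero values
except ValueError as e:
    print(e)
```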

danpovey merged commit ffc3590 into k2-fsa:master on Dec 9, 2020
@danpovey
Collaborator Author

danpovey commented Dec 9, 2020

I think we should be able to fix the determinization failure in snowfall by removing the disambiguation symbols before turning the ragged tensor of symbols into a linear one.

@qindazhu
Collaborator

qindazhu commented Dec 9, 2020

I didn't print the values before, but I did try removing the disambiguation symbols before doing determinize back then, and it still had the same issue. Anyway, let me try printing the values with your latest code.

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020 via email

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020 via email

@qindazhu
Collaborator

qindazhu commented Dec 9, 2020

Yeah, that's exactly what I did before (sorry for the wrong wording above): I saved aux_labels and labels before determinizing, removed the SimpleRaggedIndexSelect code in the Python determinize, and then after determinization called SimpleRaggedIndexSelect on the aux_labels saved from before determinization.

@qindazhu
Collaborator

qindazhu commented Dec 9, 2020

```
/ceph-hw/k2/k2/python/csrc/torch/index_select.cu:at::Tensor k2::SimpleRaggedIndexSelect1D(at::Tensor, k2::Ragged<int>&) [with T = int]:208 There must be at most one non-zero element in src for any sub-list in indexes; sub-list 30860097 has too many elements: [ 4 77469 ]
```

@danpovey, the above example is a failure (4 is the word A and 77469 is the word HARD) that I get with the code below:

```diff
diff --git a/k2/python/k2/fsa_algo.py b/k2/python/k2/fsa_algo.py
index 0c14d94..23e6449 100644
--- a/k2/python/k2/fsa_algo.py
+++ b/k2/python/k2/fsa_algo.py
@@ -316,21 +313,14 @@ def determinize(fsa: Fsa) -> Fsa:
         Otherwise, a new deterministic fsa is returned and the
         input ``fsa`` is NOT modified.
     '''
-    properties = getattr(fsa, 'properties', None)
-    if properties is not None \
-            and properties & fsa_properties.ARC_SORTED_AND_DETERMINISTIC != 0: # noqa
-        return fsa

     ragged_arc, arc_derivs = _k2.determinize(fsa.arcs)
-    aux_labels = None
-    if hasattr(fsa, 'aux_labels'):
-        aux_labels = _k2.simple_ragged_index_select(fsa.aux_labels, arc_derivs)
-    out_fsa = Fsa(ragged_arc, aux_labels)
+    out_fsa = Fsa(ragged_arc, aux_labels=None)

     for name, value in fsa.named_non_tensor_attr():
         setattr(out_fsa, name, value)

-    return out_fsa
+    return out_fsa, arc_derivs
diff --git a/snowfall/decoding/graph.py b/snowfall/decoding/graph.py
index d709e03..6d677a1 100644
--- a/snowfall/decoding/graph.py
+++ b/snowfall/decoding/graph.py

@@ -32,22 +28,27 @@ def compile_LG(
     """
     L_inv = k2.arc_sort(L.invert_())
     G = k2.arc_sort(G)
-    logging.debug("Intersecting L and G")
+    logging.info("Intersecting L and G")
     LG = k2.intersect(L_inv, G)
-    logging.debug(f'LG shape = {LG.shape}')
-    logging.debug("Connecting L*G")
+    logging.info(f'LG shape = {LG.shape}')
+    logging.info("Connecting L*G")
     LG = k2.connect(LG).invert_()
-    logging.debug(f'LG shape = {LG.shape}')
-    logging.debug("Determinizing L*G")
-    LG = k2.determinize(LG)
-    logging.debug(f'LG shape = {LG.shape}')
-    logging.debug("Connecting det(L*G)")
+    logging.info(f'LG shape = {LG.shape}')
+    aux_labels = LG.aux_labels
+    logging.info("Determinizing L*G")
+    LG, arc_derivs = k2.determinize(LG)
+    LG.labels[LG.labels >= labels_disambig_id_start] = 0
+    aux_labels[aux_labels >= aux_labels_disambig_id_start] = 0
+    logging.info('select aux_labels')
+    LG.aux_labels = k2.simple_ragged_index_select(aux_labels, arc_derivs)
+    logging.info(f'LG shape = {LG.shape}')
+    logging.info("Connecting det(L*G)")
     LG = k2.connect(LG)
     logging.debug(f'LG shape = {LG.shape}')
     logging.debug("Removing disambiguation symbols on L*G")
-    LG.labels[LG.labels >= labels_disambig_id_start] = 0
-    LG.aux_labels[LG.aux_labels >= aux_labels_disambig_id_start] = 0
     LG = k2.add_epsilon_self_loops(LG)
```

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020

I'd rather you call it arc_map rather than arc_derivs.
Not sure yet why this would happen; it feels like a bug.

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020

And please show me output with sizes, so I can compare with my setup.

@qindazhu
Collaborator

qindazhu commented Dec 9, 2020

What sizes do you need? The num_states of the determinized fsa?

@danpovey
Collaborator Author

danpovey commented Dec 9, 2020

OK, I did some debugging and it doesn't appear to be a bug. I was just wrong to think that this type of situation would not occur.

After thinking about various possible solutions, I think the best one is to 'cut the Gordian knot' and simply abandon the assumption that there will be at most one aux_label per arc. That is: to leave it as a Ragged tensor. When doing further operations like epsilon removal, we need to use the appropriate method of indexing a ragged tensor with a ragged tensor, which will append the indexed lists. (I'll have to check whether the operation already exists, but it's basically indexing the old ragged tensor with the values of the index, using ComposeRaggedShapes to compose the shapes, removing axis 1, and using that as the shape of the output ragged object.)
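
A pure-Python sketch of those semantics, with lists of lists standing in for the ragged tensors (the function name is made up for illustration):

```python
# Sketch of "indexing a ragged tensor with a ragged tensor": each row of
# `index` selects several rows of `src`, and the selected rows are appended
# to form one output row.  In k2 terms this would index src's values with
# index's values, compose the two shapes (ComposeRaggedShapes), and remove
# axis 1; here plain lists stand in for the ragged tensors.
def index_ragged_with_ragged(src, index):
    return [[v for i in row for v in src[i]] for row in index]

src = [[10], [], [20, 21]]   # ragged aux_labels: one sub-list per old arc
index = [[0, 2], [1]]        # ragged arc_map: one sub-list per new arc
print(index_ragged_with_ragged(src, index))  # [[10, 20, 21], []]
```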

I think we should probably create a RaggedTensor object that holds a RaggedShape and a Tensor, for these kinds of purposes.

@qindazhu
Collaborator

> indexing a ragged tensor with a ragged tensor

Ah, OK. It sounds like ComposeArcMaps? I thought we needed something like indexing an (aux) array with a ragged array (arc_map).

BTW, do you think we should do something like expanding the sequence of aux_labels from a Ragged to a 1-D Array for now, as we do in the host inversion?

@danpovey
Collaborator Author

danpovey commented Dec 10, 2020 via email

@qindazhu
Collaborator

I mean, for example, if one arc has two aux_labels, say [aux1, aux2], then we expand the arc into two arcs: one arc is (arc.src_state, temp_state, label=0, aux1, score=0), and the other is (temp_state, arc.dest_state, arc.label, aux2, arc.score). We expand all such arcs so that we get a new Fsa with only one aux_label per arc.
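
A small sketch of that expansion with plain tuples (the field order and the `new_state` helper are just for illustration):

```python
# Sketch: split an arc carrying several aux_labels into a chain of arcs
# carrying one aux_label each.  Arcs here are tuples
# (src_state, dest_state, label, aux_label, score); `new_state` allocates a
# fresh intermediate state.
def expand_arc(arc, aux_labels, new_state):
    src, dest, label, score = arc
    if not aux_labels:                  # no aux_label: keep the arc, aux 0
        return [(src, dest, label, 0, score)]
    arcs, cur = [], src
    for aux in aux_labels[:-1]:         # epsilon arcs carry all but the last
        nxt = new_state()
        arcs.append((cur, nxt, 0, aux, 0.0))
        cur = nxt
    # The original label and score stay on the final arc.
    arcs.append((cur, dest, label, aux_labels[-1], score))
    return arcs

# Example matching the description above: [aux1, aux2] -> two arcs.
counter = iter(range(100, 200))
print(expand_arc((0, 1, 5, -0.5), [7, 8], lambda: next(counter)))
# [(0, 100, 0, 7, 0.0), (100, 1, 5, 8, -0.5)]
```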

@danpovey
Collaborator Author

danpovey commented Dec 10, 2020 via email

@danpovey
Collaborator Author

How about we make it so that the tensor_attrs can actually contain not just Tensor but also RaggedInt? I think that might be the path of least resistance.
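
A hedged sketch of what the check might look like on the Python side; the function and the accessors are hypothetical, only the first-dimension/num-arcs invariant comes from this thread:

```python
import torch
import _k2  # k2's Python extension; RaggedInt is assumed exposed here

# Hypothetical sketch (not the actual Fsa code): accept either a torch.Tensor
# or a RaggedInt as a tensor attribute, checking that its first dimension
# equals the number of arcs.
def check_tensor_attr(value, num_arcs):
    if isinstance(value, torch.Tensor):
        assert value.shape[0] == num_arcs  # one entry per arc
    elif isinstance(value, _k2.RaggedInt):
        assert value.dim0() == num_arcs    # one sub-list per arc
    else:
        raise TypeError(f'unsupported attribute type: {type(value)}')
```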

@qindazhu
Collaborator

Agree, as RaggedInt has the same Dim0 as all the tensor_attrs in the fsa(vec), i.e. the number of arcs.

BTW, for the case where one arc has more than one aux_label, we need to keep the order of the arc-map values in each row of the RaggedInt; it seems we keep that in the host version:

```cpp
// The following is mostly for ease of interpretability of the output;
// conceptually the order makes no difference.
// TODO(dpovey): maybe remove this, for efficiency?
std::reverse(deriv_out->begin(), deriv_out->end());
```

Just mentioning this in case we forget it in the future, since for the non-top-sorted version the order may be non-ascending?

@danpovey
Collaborator Author

danpovey commented Dec 10, 2020 via email
